A framework for schema matcher composition
Abstract
Enterprise schemas tend to differ, which is a key obstacle when seamless communication between systems is of utmost importance. One solution could be to develop standards and enforce them; however, vendors seem reluctant to comply with them, and communication between existing and legacy systems remains unsolved. Another solution is schema matching, which resolves the problem at the data level and does not require vendors to adhere to any predefined schema. The task, however, is very complex, even for human evaluators. Some of the solutions proposed so far are fairly promising, but their accuracy varies. Our goal was to find means by which the results could be improved. We have focused on solutions that do not change the concept of the algorithms but fine-tune them so that they achieve higher accuracy. Our experiments showed that the results of the matchers may vary widely depending on the actual parameter settings. It also turned out that the parameters should be set for each scenario individually, as the best results are guaranteed only this way. In this article, we present a general approach for optimally disassembling existing solutions and combining some of the resulting components so that the new matcher surpasses the donor ones. Together, the composition and the optimal parameter setting provide a framework capable of enhanced performance. Improved accuracy lessens the need for follow-up human supervision.
Key-Words: schema matching, optimization, algorithm analysis, performance improvement, framework definition
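The abstract's two ideas, combining components taken from different matchers and setting parameters per scenario, can be illustrated with a small sketch. Everything in it is an assumption made for illustration and not the authors' framework: the two component similarity functions, the weighted-sum combination, the grid search, the F-measure objective, and the toy schemas are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import product


def name_similarity(a, b):
    """Hypothetical component 1: string similarity of element names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def type_similarity(a, b):
    """Hypothetical component 2: 1.0 if the data types agree, else 0.0."""
    return 1.0 if a == b else 0.0


def match(source, target, weight, threshold):
    """Combine both components with a weighted sum and keep pairs that
    score at or above the threshold. Schemas are dicts: name -> type."""
    pairs = set()
    for s_name, s_type in source.items():
        for t_name, t_type in target.items():
            score = (weight * name_similarity(s_name, t_name)
                     + (1.0 - weight) * type_similarity(s_type, t_type))
            if score >= threshold:
                pairs.add((s_name, t_name))
    return pairs


def f_measure(found, reference):
    """Harmonic mean of precision and recall against a reference mapping."""
    true_positives = len(found & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(found)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


def tune(source, target, reference, grid):
    """Exhaustive grid search over (weight, threshold) for one scenario."""
    best_params, best_score = None, -1.0
    for weight, threshold in product(grid, grid):
        score = f_measure(match(source, target, weight, threshold), reference)
        if score > best_score:
            best_params, best_score = (weight, threshold), score
    return best_params, best_score


# Toy scenario (made up for this sketch).
source = {"cust_name": "string", "cust_city": "string", "cust_id": "int"}
target = {"customerName": "string", "city": "string", "customerId": "int"}
reference = {("cust_name", "customerName"),
             ("cust_city", "city"),
             ("cust_id", "customerId")}

grid = [i / 4 for i in range(5)]  # 0.0, 0.25, 0.5, 0.75, 1.0
default_score = f_measure(match(source, target, 0.5, 0.5), reference)
best_params, best_score = tune(source, target, reference, grid)
print("default F-measure:", default_score)
print("tuned parameters:", best_params, "tuned F-measure:", best_score)
```

Because the default setting is itself a point on the grid, the tuned score can never be lower than the default one. A different scenario would generally yield different best parameters, which is the behaviour the abstract reports.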
Similar resources
An Improved Semantic Schema Matching Approach
Schema matching is a critical step in many applications, such as data warehouse loading, Online Analytical Processing (OLAP), data mining, the semantic web [2] and schema integration. The task is defined as finding the semantic correspondences between elements of two schemas. Recently, schema matching has attracted considerable interest in both research and practice. In this paper, we present a new impr...
YAM: A Step Forward for Generating a Dedicated Schema Matcher
Discovering correspondences between schema elements is a crucial task for data integration. Most schema matching tools are semiautomatic, e.g., an expert must tune certain parameters (thresholds, weights, etc.). They mainly use aggregation methods to combine similarity measures. The tuning of a matcher, especially for its aggregation function, has a strong impact on the matching quality of the ...
Boosting Schema Matchers
Schema matching is recognized to be one of the basic operations required by the process of data and schema integration, and thus has a great impact on its outcome. We propose a new approach to combining matchers into ensembles, called Schema Matcher Boosting (SMB). This approach is based on a well-known machine learning technique, called boosting. We present a boosting algorithm for schema matc...
DYMS (Dynamic Matcher Selector) - Scenario-based Schema Matcher Selector
Schema matching is one of the main challenges in different information system integration contexts. Over the past 20 years, different schema matching methods have been proposed and shown to be successful in various situations. Although numerous advanced matching algorithms have emerged, schema matching research remains a critical issue. Different algorithms are implemented to resolve different ...
Encore un outil de découverte de correspondances entre schémas XML ?
In this paper, we present YAM, a schema matcher factory. YAM (Yet Another Matcher) is not (yet) another schema matching system as it enables the generation of a la carte schema matchers according to user requirements. These requirements include a preference for recall or precision and a training data set (a set of expert correspondences or a domain of interest). YAM uses a knowledge base that i...
Publication date: 2010